Consistent Logical Checkpointing
Abstract
A "consistent checkpointing" algorithm saves a consistent view of the distributed system state on stable storage. The loss of computation upon a failure can be bounded by taking consistent checkpoints with adequate frequency. The traditional consistent checkpointing algorithms require the different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce the overhead. Some techniques for staggering the checkpoints have been proposed previously [9]; however, these techniques result in "limited staggering" in that not all processes' checkpoints can be staggered. Ideally, one would like to stagger the checkpoints arbitrarily. This report presents a simple approach to arbitrarily stagger the checkpoints. Our approach requires that the processes take consistent logical checkpoints, as compared to the consistent physical checkpoints enforced by existing algorithms. This report discusses the proposed approach and the implementation issues. The proposed approach was discussed briefly in [11].
Similar resources
Staggered Consistent Checkpointing
A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce c...
Checkpointing yNitin
A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce che...
On Staggered Checkpointing
A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce ch...
Falkirk Wheel: Rollback Recovery for Dataflow Systems
We present a new model for rollback recovery in distributed dataflow systems. We explain existing rollback schemes by assigning a logical time to each event such as a message delivery. If some processors fail during an execution, the system rolls back by selecting a set of logical times for each processor. The effect of events at times within the set is retained or restored from saved state, wh...
Lazy Checkpointing Coordination for Bounding Rollback Propagation
In this paper, we propose the technique of lazy checkpoint coordination which preserves process autonomy while employing communication-induced checkpoint co... shown that logging a nondeterministic event equivalently places a logical checkpoint [18] at the end of the ensuing state interval, and these extra logical checkpoints serve to eliminate the domino effect. Coordinated checkpointing achieves ...
Publication date: 1994